Abstract. We consider a Polya urn, started with b black and w white balls, 
where b > w. We compute the probability that there are ever the same number 
of black and white balls in the urn, and show that it is twice the probability 
of getting no more than w - 1 heads in b + w - 1 tosses of a fair coin. 



An urn contains b black and w white balls, where b > w. A ball is drawn from 
the urn at random, and then replaced with two balls of the same color. This same 
procedure is then repeated indefinitely. What is the probability that the urn ever 
contains the same number of black and white balls? 

The problem involves the famous Polya urn [3, 8, 10, 13]. The solution is not 
obvious, because the probability of a black or white draw is constantly changing, 
and an infinite number of different draw sequences can lead to equalization. The 
purpose of this note is to show that there is a remarkably simple solution: the 
equalization probability is just twice the probability that in b + w - 1 tosses of a 
fair coin, no more than w - 1 will be heads. 

A probabilist would solve this problem by noting that the draws of the Polya urn 
are exchangeable, in the sense that the probability of drawing any finite sequence 
of black and white balls depends only on the total number of black balls, and 
not on the order in which they are drawn. She would then invoke de Finetti's 
theorem, which states that an exchangeable process is a mixture of independent 
and identically distributed (i.i.d.) processes, which in this case means Bernoulli 
processes, or biased random walks [UE1I9]. In this way, she would reduce the 
equalization problem for the Polya urn to the gambler's ruin problem, whose 
well-known solution dates to the inception of probability theory [4]. Such an approach 
is not without its charms, and also leads to efficient solutions of more difficult 
problems, such as the probability that there are ever k more white than black 
balls. However, it depends on the gambler's ruin results, and it seems worthwhile, 
if possible, to prove the result directly. 

In this note, I provide an elementary proof of the main result, which does not 
depend on the gambler's ruin results. This work was inspired by a recent paper 
of Antal, Ben-Naim, and Krapivsky [2], who posed the equalization problem while 
working in the context of first-passage theory, and provided a closed-form expression 
for the solution. The expressions in the current paper, however, are new. 

Let $S_n = B_n - W_n$, the excess of black over white balls after $n$ draws. $S_n$ is a 
Markov process that starts at $S_0 = b - w > 0$, and at each step either increases 
by one, if a black ball is drawn, or decreases by one, if a white ball is drawn. The 
probabilities of these two events are just $B_n/N_n$ and $W_n/N_n$, respectively, where 
$N_n = b + w + n$ is the number of balls in the urn after $n$ draws. The trajectory of 
$S_n$ thus resembles a random walk, except that the probabilities of moving up and 
down change with every step. We are interested in the probability that the path 
ever touches the axis $S = 0$. 
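
The dynamics just described are easy to simulate. The following Python sketch is illustrative only: the function names, the fixed seed, and the cutoff `max_draws` are assumptions introduced here. Because the cutoff truncates an infinite process, the estimate slightly undercounts equalizations that would occur late.

```python
import random


def simulate_equalization(b, w, max_draws, rng):
    """Return True if the black and white counts become equal within max_draws draws."""
    black, white = b, w
    for _ in range(max_draws):
        # A black ball is drawn with probability black / (black + white).
        if rng.random() * (black + white) < black:
            black += 1  # replace the black ball with two black balls
        else:
            white += 1  # replace the white ball with two white balls
        if black == white:
            return True
    return False


def estimate_equalization(b, w, trials=4000, max_draws=500, seed=0):
    """Monte Carlo estimate of the (truncated) equalization probability."""
    rng = random.Random(seed)
    hits = sum(simulate_equalization(b, w, max_draws, rng) for _ in range(trials))
    return hits / trials
```

For b = 2 and w = 1, the estimate should come out close to 1/2, the value given by the theorem below, up to sampling noise and the truncation just mentioned.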

We will need one key fact about the process $S_n$, or equivalently, about $B_n$: the 
fraction of black balls, $B_n/N_n$, has a random limit $Z$, which is given by the $\mathrm{Beta}_{b,w}$ 
distribution: 

\[
\mathrm{Beta}_{b,w}(p) = \frac{\Gamma(b+w)}{\Gamma(b)\,\Gamma(w)}\, p^{b-1} (1-p)^{w-1} \qquad (0 < p < 1). \tag{1}
\]

An elementary proof may be obtained by following the reasoning in de Finetti [5, p. 
219], who shows how the Polya urn process may be modeled using random draws 
from the uniform distribution. For a less elementary proof, see [7, §2]. It follows that 
$\mu_n = S_n/N_n$ also converges, to $\mu = 2Z - 1$. We write $F_{b,w}(p) = \int_0^p \mathrm{Beta}_{b,w}(q)\,dq$ 
for the distribution function of the beta distribution. 

Theorem. If a Polya urn is started with b black and w white balls, then the probability 
that the number of black and white balls will ever be equal is $2F_{b,w}(\tfrac{1}{2})$. 

Proof. Let $\tau$ be the random time at which the path first touches the boundary 
$S = 0$. The probability of equalization is just the probability of the event $\{\tau < \infty\}$, 
which can be divided into the two events $\{\tau < \infty\} \cap \{\mu > 0\}$ and $\{\tau < \infty\} \cap \{\mu < 0\}$. 
(The probability that $\mu = 0$ is zero, because the density in Eq. (1) is continuous.) 
But these two events have equal probability. Indeed, $\mu$ depends only on $S_\tau(n) = 
S(\tau + n)$, because the initial segment is finite, and has no effect on the mean. At 
time $\tau$, the urn has an equal number of black and white balls, so the mean of its 
subsequent trajectory is equally likely to be positive or negative. Furthermore, if 
$\mu < 0$, then $\tau < \infty$. Indeed, if $\mu < 0$, then the path will eventually be below the 
axis, and since it started out above the axis, it must cross at some point. Thus, 

\[
P(\tau < \infty) = 2P(\{\tau < \infty\} \cap \{\mu < 0\}) = 2P(\{\mu < 0\}) = 2F_{b,w}(\tfrac{1}{2}).
\]

The last expression follows from the fact that $\mu < 0$ if and only if $Z < \tfrac{1}{2}$. □ 
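
As a worked example, take b = 2 and w = 1. The density in Eq. (1) is then 2p, its integral over [0, 1/2] is 1/4, and the equalization probability is 2 · 1/4 = 1/2. For general integer b and w, the following Python sketch evaluates the expression in the theorem numerically. The function names and the choice of composite Simpson's rule are assumptions made for illustration, not part of the original argument.

```python
import math


def beta_density(p, b, w):
    """Density of the Beta(b, w) distribution from Eq. (1), for integers b, w >= 1."""
    log_norm = math.lgamma(b + w) - math.lgamma(b) - math.lgamma(w)
    return math.exp(log_norm) * p ** (b - 1) * (1 - p) ** (w - 1)


def equalization_via_beta(b, w, steps=2000):
    """Approximate 2 * F(1/2) with composite Simpson's rule on [0, 1/2].

    `steps` is the number of subintervals and must be even.
    """
    h = 0.5 / steps
    total = beta_density(0.0, b, w) + beta_density(0.5, b, w)
    for k in range(1, steps):
        weight = 4 if k % 2 else 2  # Simpson weights: 4 at odd nodes, 2 at even nodes
        total += weight * beta_density(k * h, b, w)
    return 2 * total * h / 3
```

For integer b and w the integrand is a polynomial, so the quadrature error is small even for modest values of `steps`.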

The equalization probability can also be expressed as a binomial sum, due 
to an interesting connection between the beta and uniform distributions. Let 
$U_1, U_2, \ldots, U_n$ be $n$ independent samples from the uniform distribution on [0,1], 
and let $U_{(1)} < U_{(2)} < \cdots < U_{(n)}$ be the same samples arranged in increasing order. 
(The $U_{(i)}$ are called the order statistics of the sample.) Then the density of $U_{(b)}$ is 
given by $\mathrm{Beta}_{b,w}$, where $b + w = n + 1$. We refer the reader to [3, Eq. 1.6.7] for the 
simple proof. 
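
The order-statistic connection can be checked empirically. The Python sketch below draws the b-th smallest of n = b + w - 1 uniform samples and compares the sample mean with b/(b + w), the mean of a Beta distribution with parameters b and w. The function names, trial count, and seed are assumptions chosen for illustration, and agreement of the means is only a sanity check, not a proof.

```python
import random


def order_statistic_sample(b, w, rng):
    """Draw the b-th smallest of n = b + w - 1 independent uniforms on [0, 1]."""
    n = b + w - 1
    return sorted(rng.random() for _ in range(n))[b - 1]


def empirical_mean(b, w, trials=20000, seed=0):
    """Sample mean of the b-th order statistic over `trials` independent samples."""
    rng = random.Random(seed)
    return sum(order_statistic_sample(b, w, rng) for _ in range(trials)) / trials
```

With b = 3 and w = 2, for instance, the sample mean should be close to 3/5.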

Given this result, the probability that a $\mathrm{Beta}_{b,w}$ variable will be less than $x$ is just 
the probability that at least $b$ uniform variates will be less than $x$, or equivalently, 
that no more than $n - b = w - 1$ uniform variates will be greater than $x$. Let $\mathrm{Ber}_p$ denote 
a Bernoulli random variable, taking the value one with probability $p$, and zero 
otherwise. In symbols, then 

\[
P(\mathrm{Beta}_{b,w} < x) = P\Big(\sum_{i=1}^{b+w-1} X_i \ge b\Big) = P\Big(\sum_{i=1}^{b+w-1} Y_i \le w - 1\Big), \tag{2}
\]

where the $X_i$ and $Y_i$ are independent $\mathrm{Ber}_x$ and $\mathrm{Ber}_{1-x}$ random variables, respectively. 
The last expression, with $x = 1/2$, establishes the following corollary: 
Corollary. If a Polya urn is started with b black and w white balls, then the probability 
that the number of black and white balls will ever be equal is twice the 
probability that in b + w - 1 tosses of a fair coin, no more than w - 1 will be heads. 
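
By the theorem and Eq. (2) with x = 1/2, the equalization probability is twice the coin-tossing probability, and it can be computed exactly with rational arithmetic. The following Python sketch does so; the function names are assumptions introduced here for illustration.

```python
from fractions import Fraction
from math import comb


def coin_toss_probability(b, w):
    """Exact P(no more than w - 1 heads in b + w - 1 tosses of a fair coin)."""
    n = b + w - 1
    favorable = sum(comb(n, j) for j in range(w))  # j = 0, ..., w - 1 heads
    return Fraction(favorable, 2 ** n)


def equalization_via_coins(b, w):
    """Equalization probability as twice the coin-tossing probability."""
    return 2 * coin_toss_probability(b, w)
```

For b = 2 and w = 1 there are two tosses, the probability of no heads is 1/4, and the equalization probability is 1/2.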

Eq. (2) was first derived computationally by Karl Pearson in 1924 [12], for the 
purpose of expressing sums of binomial coefficients in terms of the more easily 
computable beta distribution. See also [4, p. 173], [11, 8.17.5]. 

Both expressions for the equalization probability are useful. The first, involving 
the beta function, is easily computed numerically. The second can be used to 
establish central limit results, using the de Moivre-Laplace theorem [4], or large 
deviations results, using Cramer's theorem [9]. This second expression can also be 
written as an explicit sum of binomial coefficients, 

\[
\frac{1}{2^{b+w-2}} \sum_{j=0}^{w-1} \binom{b+w-1}{j} \;=\; 1 - \frac{1}{2^{b+w-1}} \sum_{j=w}^{b-1} \binom{b+w-1}{j},
\]

and these forms are useful when w or b - w, respectively, is small. 
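
The two binomial forms can be compared directly. The Python sketch below assumes the forms are the lower-tail sum over j = 0, ..., w - 1 divided by 2^(b+w-2), and one minus the middle sum over j = w, ..., b - 1 divided by 2^(b+w-1). These follow from the symmetry of the binomial coefficients, and the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def lower_tail_form(b, w):
    """Sum over j = 0..w-1; has w terms, so it is short when w is small."""
    n = b + w - 1
    return Fraction(sum(comb(n, j) for j in range(w)), 2 ** (n - 1))


def middle_form(b, w):
    """One minus the sum over j = w..b-1; has b - w terms, short when b - w is small."""
    n = b + w - 1
    return 1 - Fraction(sum(comb(n, j) for j in range(w, b)), 2 ** n)
```

Both functions return the same exact rational number for any integers b > w >= 1, so the choice between them is purely a matter of how many terms must be summed.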